Compression of High-dimensional Data Spaces Using Non-differential Augmented Vector Quantization
Authors
Abstract
Most data-intensive applications face I/O bottlenecks, slow query processing, and large storage requirements. Database compression alleviates the I/O bottleneck, reduces disk space usage, improves disk access speed, shortens query response and overall retrieval times, and increases the effective I/O bandwidth. However, random access to individual tuples in a compressed database is very difficult to achieve with most available compression techniques. This paper reports a lossless compression technique called non-differential augmented vector quantization. The technique is applicable to a collection of tuples and is especially effective for tuples with numerous low- to medium-cardinality fields. In addition, it supports standard database operations and permits very fast random access and atomic decompression of tuples in large collections. The technique maps a database relation into a static bitmap index cached access structure. Consequently, we achieved substantial savings in space by storing each database tuple as a bit value in memory. An important distinguishing characteristic of our technique is that tuples can be compressed and decompressed individually rather than a full page or an entire relation at a time. Furthermore, the information needed for tuple compression and decompression can reside in memory. Possible application domains include decision support systems and statistical and life databases with low-cardinality fields and possibly no text fields.
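To make the idea of compressing and decompressing tuples individually concrete, the following Python sketch packs each tuple of low-cardinality fields into a single integer using per-field dictionaries and mixed-radix coding. It is an illustration under stated assumptions, not the paper's non-differential augmented vector quantization algorithm or its bitmap index structure; the function names (build_domains, encode, decode, bits_per_tuple) and the sample relation are hypothetical.

```python
from math import prod


def build_domains(tuples):
    """Collect the sorted distinct values of each field (one dictionary per field)."""
    return [sorted(set(column)) for column in zip(*tuples)]


def encode(tup, domains):
    """Pack one tuple into a single integer with mixed-radix positional coding."""
    code = 0
    for value, domain in zip(tup, domains):
        # Each field contributes one "digit" whose radix is the field cardinality.
        code = code * len(domain) + domain.index(value)
    return code


def decode(code, domains):
    """Recover one tuple from its code without touching any other tuple."""
    values = []
    for domain in reversed(domains):
        code, index = divmod(code, len(domain))
        values.append(domain[index])
    return tuple(reversed(values))


def bits_per_tuple(domains):
    """Number of bits needed to store any tuple code for these domains."""
    return max(1, (prod(len(d) for d in domains) - 1).bit_length())


# Hypothetical relation with three low-cardinality fields.
relation = [
    ("F", "single", "north"),
    ("M", "married", "south"),
    ("F", "married", "east"),
    ("M", "single", "west"),
]

domains = build_domains(relation)
codes = [encode(t, domains) for t in relation]
restored = [decode(c, domains) for c in codes]
```

Only the small per-field dictionaries in `domains` are needed to decode any single code, which mirrors the abstract's point that the information required for tuple compression and decompression can reside in memory.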
Similar resources
Compressing High-Dimensional Data Spaces Using Non-Differential Augmented Vector Quantization
Most data-intensive applications are confronted with the problems of I/O bottleneck, poor query processing times and space requirements. Database compression has been found to alleviate the I/O bottleneck, reduce disk space usage, improve disk access speed, speed up queries, reduce overall retrieval time and increase the effective I/O bandwidth. However, random access to individual tuples in a com...
Joint Image Compression and Classification with Vector Quantization and a Two Dimensional Hidden Markov Model
We present an algorithm to achieve good compression and classification for images using vector quantization and a two dimensional hidden Markov model. The feature vectors of image blocks are assumed to be generated by a two dimensional hidden Markov model. We first estimate the parameters of the model, then design a vector quantizer to minimize a weighted sum of compression distortion and class...
Semantic Database Compression System Based on Augmented Vector Quantization
In recent years, the amount of data stored in databases has increased enormously with the widespread use of databases and the rapid adoption of information systems and data warehouse technologies. Storing and retrieving this growing volume of data efficiently is a challenge. This challenge matters in database systems for two reasons: storage cost reduction and performance i...
Vector Quantization Parallelization
Quantization is the process of representing a large set of input values with a much smaller set. In signal processing and image processing, vector quantization is a classical quantization technique that extends scalar quantization to multi-dimensional space. It is widely used in many applications such as data compression, data correction, pattern recognition, and density estimation. This project pro...
Multiresolution Model Compression Using 3-D Wavelets
Three-dimensional (3-D) objects are often represented by geometric models in applications dealing with virtual reality, augmented reality, and cyberspace. Surface representations can provide an effective visualization of these objects. Polygonal models are the most prevalent type of surface representation. Recently, multiresolution representation (surface simplification) of polygonal models has...